The R-Package 'surveillance'
This document gives an introduction to the R-Package 'surveillance', which contains tools for outbreak detection in routinely collected surveillance data. The package contains an implementation of the procedures described by Stroup et al. (1989), Farrington et al. (1996) and the system used at the Robert Koch Institute, Germany. For evaluation purposes, the package contains example data sets and functionality to generate surveillance data by simulation. To compare the algorithms, benchmark numbers like sensitivity, specificity, and detection delay can be computed for a set of time series. Because the package is open source, it should be easy to integrate new algorithms; as an example of this process, a simple Bayesian surveillance algorithm is described, implemented and evaluated.
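The benchmark numbers named in this abstract can be illustrated with a minimal Python sketch. It does not use the 'surveillance' package. The function name `benchmark`, the 0/1 encoding of alarms and outbreak states, and the exact definitions (per-time-point sensitivity and specificity, delay counted from the first outbreak time point to the first alarm at or after it) are assumptions made for illustration and may differ from the package's own definitions.

```python
def benchmark(alarms, outbreak):
    """Compare a 0/1 alarm series with a 0/1 true outbreak-state series.

    Hypothetical helper, not part of the 'surveillance' package.
    Returns (sensitivity, specificity, detection_delay).
    """
    if len(alarms) != len(outbreak):
        raise ValueError("both series must have the same length")

    pairs = list(zip(alarms, outbreak))
    tp = sum(1 for a, o in pairs if a and o)          # alarm during outbreak
    fn = sum(1 for a, o in pairs if not a and o)      # missed outbreak point
    tn = sum(1 for a, o in pairs if not a and not o)  # correct silence
    fp = sum(1 for a, o in pairs if a and not o)      # false alarm

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")

    # Detection delay: time points between the first outbreak time point
    # and the first alarm raised at or after it (None if never detected).
    start = next((t for t, o in enumerate(outbreak) if o), None)
    delay = None
    if start is not None:
        delay = next(
            (t - start for t in range(start, len(alarms)) if alarms[t]), None
        )
    return sensitivity, specificity, delay


# Made-up series: the outbreak covers time points 3 to 5.
outbreak = [0, 0, 0, 1, 1, 1, 0, 0]
alarms = [0, 1, 0, 0, 1, 1, 0, 0]
sens, spec, delay = benchmark(alarms, outbreak)
print(sens, spec, delay)
```

In this made-up example the alarm at time point 1 is a false alarm, and the outbreak starting at time point 3 is first flagged one step later.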
Multilabel Classification with R Package mlr
We implemented several multilabel classification algorithms in the machine
learning package mlr. The implemented methods are binary relevance, classifier
chains, nested stacking, dependent binary relevance and stacking, which can be
used with any base learner that is accessible in mlr. Moreover, there is access
to the multilabel classification versions of randomForestSRC and rFerns. All
these methods can be easily compared by different implemented multilabel
performance measures and resampling methods in the standardized mlr framework.
In a benchmark experiment with several multilabel datasets, the performance of
the different methods is evaluated.
Comment: 18 pages, 2 figures, to be published in R Journal; reference
corrected
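Binary relevance, the first method listed in this abstract, trains one independent binary classifier per label. The following Python sketch illustrates the idea only; it is not mlr code. The class names `NearestNeighbour` and `BinaryRelevance`, the toy 1-nearest-neighbour base learner, and the choice of Hamming loss as the performance measure are all assumptions made for illustration.

```python
class NearestNeighbour:
    """Toy 1-nearest-neighbour binary learner (stand-in for any base learner)."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        def nearest_label(x):
            # Squared Euclidean distance to every training point.
            dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
            return self.y[dists.index(min(dists))]

        return [nearest_label(x) for x in X]


class BinaryRelevance:
    """Fit one independent binary model per label column."""

    def __init__(self, make_learner):
        self.make_learner = make_learner  # factory returning a fresh learner

    def fit(self, X, Y):
        n_labels = len(Y[0])
        self.models = [
            self.make_learner().fit(X, [row[j] for row in Y])
            for j in range(n_labels)
        ]
        return self

    def predict(self, X):
        columns = [model.predict(X) for model in self.models]
        return [list(row) for row in zip(*columns)]  # back to one row per sample


def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label assignments that are wrong."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(
        t != p for rt, rp in zip(Y_true, Y_pred) for t, p in zip(rt, rp)
    )
    return wrong / total


# Made-up data: label 1 follows the first feature, label 2 the second.
X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
model = BinaryRelevance(NearestNeighbour).fit(X_train, Y_train)
Y_pred = model.predict([[0.1, 0.9], [0.9, 0.2]])
print(Y_pred, hamming_loss([[0, 1], [1, 1]], Y_pred))
```

Because each label gets its own model, binary relevance ignores dependencies between labels. The other methods named in the abstract (classifier chains, nested stacking, dependent binary relevance, stacking) feed label information into the per-label models.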
Introduction to the R package TDA
We present a short tutorial and introduction to using the R package TDA,
which provides some tools for Topological Data Analysis. In particular, it
includes implementations of functions that, given some data, provide
topological information about the underlying space, such as the distance
function, the distance to a measure, the kNN density estimator, the kernel
density estimator, and the kernel distance. The salient topological features of
the sublevel sets (or superlevel sets) of these functions can be quantified
with persistent homology. We provide an R interface for the efficient
algorithms of the C++ libraries GUDHI, Dionysus and PHAT, including a function
for the persistent homology of the Rips filtration, and one for the persistent
homology of sublevel sets (or superlevel sets) of arbitrary functions evaluated
over a grid of points. The significance of the features in the resulting
persistence diagrams can be analyzed with functions that implement recently
developed statistical methods. The R package TDA also includes the
implementation of an algorithm for density clustering, which allows us to
identify the spatial organization of the probability mass associated with a
density function and visualize it by means of a dendrogram, the cluster tree.
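Two of the functions named in this abstract, the distance function and the distance to a measure, can be evaluated over a grid with a few lines of Python. The sketch does not call the TDA package. The function names, the parameter `m0`, and the empirical distance-to-measure formula used here (root mean squared distance to the k = ceil(m0 * n) nearest data points) are stated as assumptions.

```python
import math


def distance_function(x, data):
    """Distance from x to the nearest data point."""
    return min(math.dist(x, p) for p in data)


def distance_to_measure(x, data, m0):
    """Empirical distance to a measure with mass parameter m0 in (0, 1].

    Assumed form: root mean squared distance from x to its
    k = ceil(m0 * n) nearest data points.
    """
    k = math.ceil(m0 * len(data))
    nearest = sorted(math.dist(x, p) for p in data)[:k]
    return math.sqrt(sum(d * d for d in nearest) / k)


# Made-up data: the four corners of the unit square.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Evaluate both functions over a 3 x 3 grid covering the square.
grid = [(i / 2, j / 2) for i in range(3) for j in range(3)]
for g in grid:
    print(g, round(distance_function(g, data), 3),
          round(distance_to_measure(g, data, 0.5), 3))
```

The distance function is zero at every data point, so a single outlier creates a new zero. The distance to a measure averages over several neighbours, which makes it less sensitive to such points.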
Monitoring data in R with the lumberjack package
Monitoring data while it is processed and transformed can yield detailed
insight into the dynamics of a (running) production system. The lumberjack
package is a lightweight package allowing users to follow how an R object is
transformed as it is manipulated by R code. The package abstracts all logging
code from the user, who only needs to specify which objects are logged and what
information should be logged. A few default loggers are included with the
package but the package is extensible through user-defined logger objects.
Comment: Accepted for publication in the Journal of Statistical Software
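The idea of following an object through a sequence of transformations while a separate logger records what changed can be sketched in Python. This is an analogy, not the lumberjack API. The names `CellChangeLogger`, `run_logged`, `impute_missing` and `clip_negative` are hypothetical, and the logged quantity (number of changed cells per step) is an assumption chosen for illustration.

```python
class CellChangeLogger:
    """Hypothetical logger: records how many cells each step changed."""

    def __init__(self):
        self.records = []

    def log(self, step_name, before, after):
        changed = sum(
            b != a
            for row_b, row_a in zip(before, after)
            for b, a in zip(row_b, row_a)
        )
        self.records.append((step_name, changed))


def run_logged(data, steps, logger):
    """Apply each step in turn; the steps themselves contain no logging code."""
    for step in steps:
        before = [row[:] for row in data]  # snapshot before the step runs
        data = step(data)
        logger.log(step.__name__, before, data)
    return data


def impute_missing(data):
    """Replace missing values (None) with 0."""
    return [[0 if v is None else v for v in row] for row in data]


def clip_negative(data):
    """Set negative values to 0."""
    return [[max(v, 0) for v in row] for row in data]


logger = CellChangeLogger()
result = run_logged([[1, None], [-2, 3]], [impute_missing, clip_negative], logger)
print(result)
print(logger.records)
```

The transformation functions stay free of logging code. Only the runner and the logger object know that changes are tracked, and a different logger can be swapped in without touching the steps.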
Prospects and Challenges in R Package Development
R, a software package for statistical computing and graphics, has evolved into the lingua franca of (computational) statistics. One of the cornerstones of R's success is the decentralized and modularized way of creating software using a multi-tiered development model: The R Development Core Team provides the "base system", which delivers basic statistical functionality, and many other developers contribute code in the form of extensions in a standardized format via so-called packages. In order to be accessible by a broader audience, packages are made available via standardized source code repositories. To support such a loosely coupled development model, repositories should be able to verify that the provided packages meet certain formal quality criteria and "work": both relative to the development of the base R system as well as with other packages (interoperability). However, established quality assurance systems and collaborative infrastructures typically face several challenges, some of which we will discuss in this paper.
Series: Research Report Series / Department of Statistics and Mathematics
beadarrayFilter: an R package to filter beads
Microarrays enable the expression levels of thousands of genes to be measured simultaneously. However, only a small fraction of these genes are expected to be expressed under different experimental conditions. Filtering has therefore been introduced as a step in the microarray preprocessing pipeline. Gene filtering aims at reducing the dimensionality of data by filtering redundant features prior to the actual statistical analysis. Previous filtering methods focus on the Affymetrix platform and cannot be easily ported to the Illumina platform. As such, we developed a filtering method for Illumina bead arrays and an R package, beadarrayFilter, to implement it. In this paper, the main functions in the package are highlighted and, using many examples, we illustrate how beadarrayFilter can be used to filter bead arrays.